Learning with the Set Covering Machine
نویسندگان
چکیده
We generalize the classical algorithms of Valiant and Haussler for learning conjunctions and disjunctions of Boolean attributes to the problem of learning these functions over arbitrary sets of features; including features that are constructed from the data. The result is a general-purpose learning machine, suitable for practical learning tasks, that we call the Set Covering Machine. We present a version of the Set Covering Machine that uses generalized balls for its set of data-dependent features and compare its performance to the famous Support Vector Machine. By extending a technique pioneered by Littlestone and Warmuth, we bound its generalization error as function of the amount of data compression it achieves during training.
منابع مشابه
A Hybrid Machine Learning Method for Intrusion Detection
Data security is an important area of concern for every computer system owner. An intrusion detection system is a device or software application that monitors a network or systems for malicious activity or policy violations. Already various techniques of artificial intelligence have been used for intrusion detection. The main challenge in this area is the running speed of the available implemen...
متن کاملMachine Learning Models for Housing Prices Forecasting using Registration Data
This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...
متن کاملReal-time Scheduling of a Flexible Manufacturing System using a Two-phase Machine Learning Algorithm
The static and analytic scheduling approach is very difficult to follow and is not always applicable in real-time. Most of the scheduling algorithms are designed to be established in offline environment. However, we are challenged with three characteristics in real cases: First, problem data of jobs are not known in advance. Second, most of the shop’s parameters tend to be stochastic. Third, th...
متن کاملAnalyzing new features of infected web content in detection of malicious web pages
Recent improvements in web standards and technologies enable the attackers to hide and obfuscate infectious codes with new methods and thus escaping the security filters. In this paper, we study the application of machine learning techniques in detecting malicious web pages. In order to detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...
متن کاملModeling of Chloride Ion Separation by Nanofiltration Using Machine Learning Techniques
In this work, several machine learning techniques are presented for nanofiltration modeling. According to the results, specific errors are defined. The rejection due to Nanofiltration increases with pressure but decreases with increasing the concentration of chloride ion. Methods of machine learning represent the rejection of nanofiltration as a function of concentration, pH, pressure and also ...
متن کامل